# Required packages for our course. Do not delete.
library(tidyverse)
## Warning: package 'tidyverse' was built under R version 4.3.3
## Warning: package 'lubridate' was built under R version 4.3.3
library(mosaic)
library(plotly)
## Warning: package 'plotly' was built under R version 4.3.3
library(lubridate)
library(leaflet)
## Warning: package 'leaflet' was built under R version 4.3.3
knitr::include_graphics("C:\\Users\\shrir\\OneDrive\\Desktop\\download.jpg")
Embarking on this comprehensive analysis, I aim to unravel the multifaceted dynamics of the Indian Premier League (IPL) from 2008 to 2023. My goal is to understand the evolution of team performances, identify the key factors that contribute to match victories, assess the impact of individual players on game outcomes, and explore how venues and match conditions influence these aspects.
Performance Trends Over Time
I will begin by examining the win-loss records of teams across the years. Using time series analysis, I’ll track performance trends and changes in team strategies. Line graphs and bar charts in R will help visualize these trends and highlight significant shifts in team performances.
Key Factors for Winning
Next, I’ll delve into the factors that lead to match victories. By conducting regression analysis and possibly logistic regression (for binary outcomes like win or loss), I aim to uncover whether winning the toss, player performance, or other variables significantly affect the match outcome. R’s statistical modeling capabilities will be instrumental here.
Player Impact
To gauge individual contributions, I’ll analyze player statistics using R’s data manipulation and analysis packages like dplyr and tidyr. By computing metrics such as run rates, wickets, and player efficiency, I’ll pinpoint key players and their impact on the game’s results, utilizing scatter plots and correlation matrices for visualization.
Venue Influence
Investigating the role of venues, I’ll compare match outcomes across different locations, considering factors like home advantage and pitch conditions. Chi-square tests for categorical data analysis might be useful to determine if the venue significantly influences the match result.
Throughout this project, I’ll rely on R’s comprehensive ecosystem of packages like ggplot2 for data visualization.
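The logistic regression mentioned under Key Factors for Winning could be sketched roughly as follows. This is a minimal sketch on simulated data, not the model used in this project: the data frame `toy` and the derived columns `home_win`, `toss_to_home`, and `bat_first` are hypothetical, and only the raw column names (`home_team`, `away_team`, `toss_won`, `decision`, `winner`) follow the dataset.

```r
# Simulated stand-in for the IPL data; only the column names mirror the real file.
set.seed(42)
n <- 200
teams <- c("MI", "CSK", "RCB", "KKR")
toy <- data.frame(
  home_team = sample(teams, n, replace = TRUE),
  decision  = sample(c("BAT FIRST", "BOWL FIRST"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
# Pick an away team that differs from the home team
toy$away_team <- vapply(toy$home_team,
                        function(h) sample(setdiff(teams, h), 1),
                        character(1), USE.NAMES = FALSE)
# Toss and match result are assigned at random, so this toy data carries no real signal
toy$toss_won <- ifelse(runif(n) < 0.5, toy$home_team, toy$away_team)
toy$winner   <- ifelse(runif(n) < 0.5, toy$home_team, toy$away_team)

# Binary outcome and predictors (hypothetical derived columns)
toy$home_win     <- as.integer(toy$winner == toy$home_team)
toy$toss_to_home <- as.integer(toy$toss_won == toy$home_team)
toy$bat_first    <- as.integer(toy$decision == "BAT FIRST")  # decision of the toss winner

# Logistic regression: do the toss result or the toss decision shift the odds of a home win?
fit <- glm(home_win ~ toss_to_home + bat_first, data = toy, family = binomial)
summary(fit)$coefficients

# Odds ratios are often easier to read than log-odds
exp(coef(fit))
```

On the real data, the same formula would be fitted to `ipl` after deriving the three columns; matches without a winner would need to be handled first.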
The dataset I’m working with is a collection of Indian Premier League (IPL) match data from 2008 to 2023, which I obtained from Kaggle. Kaggle is a popular platform for data science and machine learning that provides a wide range of datasets. This particular dataset encapsulates detailed information about each IPL match, including match dates, participating teams, venues, scores, and individual player performances. It’s a comprehensive set, ideal for conducting in-depth analyses of trends, performance metrics, and predictive modeling within the realm of cricket analytics.
The code below loads the data and uses the head() function to show the first 10 rows.
file_path <- "C:/Users/shrir/OneDrive/Desktop/Ipl-clean-data2008-2023.csv"
# Load the dataset
ipl <- read.csv(file_path, header = TRUE, stringsAsFactors = FALSE)
# Display the first 10 rows using the head() function
head(ipl, 10)
| X | season | id | name | short_name | description | home_team | away_team | toss_won | decision | X1st_inning_score | X2nd_inning_score | winner | result | start_date | end_date | venue_id | venue_name | home_captain | away_captain | pom | points | super_over | home_overs | home_runs | home_wickets | home_boundaries | away_overs | away_runs | away_wickets | away_boundaries | highlights | home_key_batsman | home_key_bowler | home_playx1 | away_playx1 | away_key_batsman | away_key_bowler | match_days | umpire1 | umpire2 | tv_umpire | referee | reserve_umpire |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | 2023 | 1359544 | Royal Challengers Bangalore v Gujarat Titans | RCB v GT | 70th Match (N), Indian Premier League at Bengaluru, May 21 2023 | RCB | GT | GT | BOWL FIRST | 197/5 | 198/4 | GT | Titans won by 6 wkts (5b rem) | 2023-05-21T14:00Z | 2023-05-22T23:59Z | 57897 | M.Chinnaswamy Stadium, Bengaluru | Faf du Plessis | Hardik Pandya | Shubman Gill | Gujarat Titans 2, Royal Challengers Bangalore 0 | False | 20 | 197 | 5 | 28 | 19.1 | 198 | 4 | 25 | Gill’s second straight century trumps Kohli’s to knock RCB out of playoffs race. Gill’s unbeaten 104 featured seven more sixes than Kohli’s 101 not out | Virat Kohli,Faf du Plessis | Mohammed Siraj,Harshal Patel | Virat Kohli (UKN),Faf du Plessis (UKN),Glenn Maxwell (AR),Mahipal Lomror (AR),Michael Bracewell (AR),Dinesh Karthik (UKN),Anuj Rawat (WK),Harshal Patel (BL),Wayne Parnell (BL),Vijaykumar Vyshak (BL),Mohammed Siraj (BL),Himanshu Sharma (BL) | Wriddhiman Saha (WK),Shubman Gill (UKN),Vijay Shankar (AR),Dasun Shanaka (AR),David Miller (UKN),Rahul Tewatia (AR),Hardik Pandya (AR),Rashid Khan (AR),Noor Ahmad (BL),Mohammed Shami (BL),Mohit Sharma (BL),Yash Dayal (BL) | Shubman Gill,Vijay Shankar | Noor Ahmad,Rashid Khan | 21 May 2023 - night match (20-over match) | Nitin Menon | Virender Sharma | Tapan Sharma | Javagal Srinath | VM Dhokre |
| 5 | 2023 | 1359543 | Mumbai Indians v Sunrisers Hyderabad | MI v SRH | 69th Match (D/N), Indian Premier League at Mumbai, May 21 2023 | MI | SRH | MI | BOWL FIRST | 200/5 | 201/2 | MI | Mumbai won by 8 wkts (12b rem) | 2023-05-21T10:00Z | 2023-05-22T23:59Z | 58324 | Wankhede Stadium, Mumbai | Rohit Sharma | Aiden Markram | Cameron Green | Mumbai Indians 2, Sunrisers Hyderabad 0 | False | 18 | 201 | 2 | 31 | 20.0 | 200 | 5 | 27 | Green century and Madhwal four-for help Mumbai Indians finish fourth. Mumbai complete fourth 200+ chase this season and set up eliminator date with LSG | Cameron Green,Rohit Sharma | Akash Madhwal,Chris Jordan | Ishan Kishan (WK),Rohit Sharma (UKN),Cameron Green (AR),Suryakumar Yadav (UKN),Tim David (UKN),Nehal Wadhera (UKN),Chris Jordan (BL),Piyush Chawla (AR),Jason Behrendorff (BL),Kumar Kartikeya (BL),Akash Madhwal (BL) | Vivrant Sharma (AR),Mayank Agarwal (UKN),Heinrich Klaasen (WK),Glenn Phillips (UKN),Aiden Markram (UKN),Harry Brook (UKN),Sanvir Singh (AR),Nitish Kumar Reddy (AR),Mayank Dagar (BL),Bhuvneshwar Kumar (BL),Umran Malik (BL),Kartik Tyagi (BL) | Mayank Agarwal,Vivrant Sharma | Bhuvneshwar Kumar,Mayank Dagar | 21 May 2023 - day/night match (20-over match) | KN Ananthapadmanabhan | Rod Tucker | Rohan Pandit | Pankaj Dharmani | Parashar Joshi |
| 6 | 2023 | 1359542 | Kolkata Knight Riders v Lucknow Super Giants | KKR v LSG | 68th Match (N), Indian Premier League at Kolkata, May 20 2023 | KKR | LSG | KKR | BOWL FIRST | 176/8 | 175/7 | LSG | Super Giants won by 1 run | 2023-05-20T14:00Z | 2023-05-21T23:59Z | 57980 | Eden Gardens, Kolkata | Nitish Rana | Krunal Pandya | Nicholas Pooran | Lucknow Super Giants 2, Kolkata Knight Riders 0 | False | 20 | 175 | 7 | 24 | 20.0 | 176 | 8 | 22 | Pooran, Bishnoi seal Lucknow Super Giants’ playoffs spot with thrilling one-run win. Rinku Singh nearly pulled off a stunning chase as KKR’s campaign came to an end | Rinku Singh,Jason Roy | Sunil Narine,Shardul Thakur | Jason Roy (UKN),Venkatesh Iyer (AR),Nitish Rana (UKN),Rahmanullah Gurbaz (WK),Rinku Singh (UKN),Andre Russell (AR),Shardul Thakur (BL),Sunil Narine (AR),Vaibhav Arora (BL),Varun Chakravarthy (BL),Suyash Sharma (BL),Harshit Rana (BL) | Karan Sharma (AR),Quinton de Kock (WK),Prerak Mankad (AR),Marcus Stoinis (AR),Krunal Pandya (AR),Ayush Badoni (UKN),Nicholas Pooran (UKN),Krishnappa Gowtham (AR),Ravi Bishnoi (BL),Naveen-ul-Haq (BL),Mohsin Khan (BL),Yash Thakur (BL) | Nicholas Pooran,Quinton de Kock | Ravi Bishnoi,Yash Thakur | 20 May 2023 - night match (20-over match) | Ulhas Gandhe | Jayaraman Madanagopal | Yeshwant Barde | Manu Nayyar | Mohamed Rafi |
| 7 | 2023 | 1359541 | Delhi Capitals v Chennai Super Kings | DC v CSK | 67th Match (D/N), Indian Premier League at Delhi, May 20 2023 | DC | CSK | CSK | BAT FIRST | 223/3 | 146/9 | CSK | Super Kings won by 77 runs | 2023-05-20T10:00Z | 2023-05-21T23:59Z | 58040 | Arun Jaitley Stadium, Delhi | David Warner | MS Dhoni | Ruturaj Gaikwad | Chennai Super Kings 2, Delhi Capitals 0 | False | 20 | 146 | 9 | 17 | 20.0 | 223 | 3 | 31 | Gaikwad, Conway script CSK’s big win. For Delhi Capitals, the end was as tame as the start of the season | David Warner,Axar Patel | Chetan Sakariya,Anrich Nortje | Prithvi Shaw (UKN),David Warner (UKN),Phil Salt (WK),Rilee Rossouw (UKN),Yash Dhull (UKN),Axar Patel (AR),Aman Hakim Khan (AR),Lalit Yadav (AR),Anrich Nortje (BL),Kuldeep Yadav (BL),Chetan Sakariya (BL),Khaleel Ahmed (BL) | Ruturaj Gaikwad (UKN),Devon Conway (UKN),Shivam Dube (AR),MS Dhoni (WK),Ravindra Jadeja (AR),Ajinkya Rahane (UKN),Moeen Ali (AR),Ambati Rayudu (UKN),Deepak Chahar (BL),Tushar Deshpande (BL),Maheesh Theekshana (BL),Matheesha Pathirana (BL) | Devon Conway,Ruturaj Gaikwad | Deepak Chahar,Matheesha Pathirana | 20 May 2023 - day/night match (20-over match) | Chris Gaffaney | Nikhil Patwardhan | Anil Chaudhary | Sanjay Verma | Mohit Krishnadas |
| 8 | 2023 | 1359540 | Punjab Kings v Rajasthan Royals | PBKS v RR | 66th Match (N), Indian Premier League at Dharamsala, May 19 2023 | PBKS | RR | RR | BOWL FIRST | 187/5 | 189/6 | RR | Royals won by 4 wkts (2b rem) | 2023-05-19T14:00Z | 2023-05-20T23:59Z | 58056 | Himachal Pradesh Cricket Association Stadium, Dharamsala | Shikhar Dhawan | Sanju Samson | Devdutt Padikkal | Rajasthan Royals 2, Punjab Kings 0 | False | 20 | 187 | 5 | 26 | 19.4 | 189 | 6 | 27 | Padikkal, Hetmyer keep Royals in the hunt with last over-win. They now rely on RCB and Mumbai Indians to make the playoffs; Kings have been eliminated | Sam Curran,Jitesh Sharma | Kagiso Rabada,Rahul Chahar | Prabhsimran Singh (UKN),Shikhar Dhawan (UKN),Atharva Taide (AR),Liam Livingstone (AR),Sam Curran (AR),Jitesh Sharma (WK),M Shahrukh Khan (UKN),Harpreet Brar (BL),Rahul Chahar (BL),Kagiso Rabada (BL),Arshdeep Singh (BL),Nathan Ellis (BL) | Yashasvi Jaiswal (UKN),Jos Buttler (UKN),Devdutt Padikkal (UKN),Sanju Samson (WK),Shimron Hetmyer (UKN),Riyan Parag (UKN),Dhruv Jurel (UKN),Trent Boult (BL),Adam Zampa (BL),Navdeep Saini (BL),Sandeep Sharma (BL),Yuzvendra Chahal (BL) | Devdutt Padikkal,Yashasvi Jaiswal | Navdeep Saini,Adam Zampa | 19 May 2023 - night match (20-over match) | Nand Kishore | Rod Tucker | Navdeep Singh | Pankaj Dharmani | Parashar Joshi |
| 9 | 2023 | 1359539 | Sunrisers Hyderabad v Royal Challengers Bangalore | SRH v RCB | 65th Match (N), Indian Premier League at Hyderabad, May 18 2023 | SRH | RCB | RCB | BOWL FIRST | 186/5 | 187/2 | RCB | RCB won by 8 wkts (4b rem) | 2023-05-18T14:00Z | 2023-05-19T23:59Z | 58142 | Rajiv Gandhi International Stadium, Uppal, Hyderabad | Aiden Markram | Faf du Plessis | Virat Kohli | Royal Challengers Bangalore 2, Sunrisers Hyderabad 0 | False | 20 | 186 | 5 | 23 | 19.2 | 187 | 2 | 26 | Clinical Kohli, du Plessis keep RCB’s fate in their hands. The opening pair bossed the chase after Klaasen’s masterly 104 from 51 balls set a target of 187 | Heinrich Klaasen,Harry Brook | T Natarajan,Bhuvneshwar Kumar | Abhishek Sharma (AR),Rahul Tripathi (UKN),Aiden Markram (UKN),Heinrich Klaasen (WK),Harry Brook (UKN),Glenn Phillips (UKN),Abdul Samad (UKN),Bhuvneshwar Kumar (BL),Kartik Tyagi (BL),Mayank Dagar (BL),Nitish Kumar Reddy (AR),T Natarajan (BL) | Virat Kohli (UKN),Faf du Plessis (UKN),Glenn Maxwell (AR),Michael Bracewell (AR),Mahipal Lomror (AR),Anuj Rawat (WK),Shahbaz Ahmed (AR),Harshal Patel (BL),Wayne Parnell (BL),Karn Sharma (BL),Mohammed Siraj (BL) | Virat Kohli,Faf du Plessis | Michael Bracewell,Mohammed Siraj | 18 May 2023 - night match (20-over match) | Bruce Oxenford | Virender Sharma | Saiyed Khalid | Javagal Srinath | VM Dhokre |
| 10 | 2023 | 1359538 | Punjab Kings v Delhi Capitals | PBKS v DC | 64th Match (N), Indian Premier League at Dharamsala, May 17 2023 | PBKS | DC | PBKS | BOWL FIRST | 213/2 | 198/8 | DC | Capitals won by 15 runs | 2023-05-17T14:00Z | 2023-05-18T23:59Z | 58056 | Himachal Pradesh Cricket Association Stadium, Dharamsala | Shikhar Dhawan | David Warner | Rilee Rossouw | Delhi Capitals 2, Punjab Kings 0 | False | 20 | 198 | 8 | 28 | 20.0 | 213 | 2 | 31 | Rossouw blitz puts Punjab Kings on brink of elimination. Despite Livingstone’s 94 off 48, Kings fall short in their chase of 214 | Liam Livingstone,Atharva Taide | Sam Curran,Arshdeep Singh | Prabhsimran Singh (UKN),Shikhar Dhawan (UKN),Atharva Taide (AR),Liam Livingstone (AR),Jitesh Sharma (WK),M Shahrukh Khan (UKN),Sam Curran (AR),Harpreet Brar (BL),Rahul Chahar (BL),Kagiso Rabada (BL),Arshdeep Singh (BL),Nathan Ellis (BL) | David Warner (UKN),Prithvi Shaw (UKN),Rilee Rossouw (UKN),Phil Salt (WK),Axar Patel (AR),Aman Hakim Khan (AR),Yash Dhull (UKN),Kuldeep Yadav (BL),Anrich Nortje (BL),Ishant Sharma (BL),Khaleel Ahmed (BL),Mukesh Kumar (BL) | Rilee Rossouw,Prithvi Shaw | Anrich Nortje,Ishant Sharma | 17 May 2023 - night match (20-over match) | KN Ananthapadmanabhan | Saidharshan Kumar | Rod Tucker | Pankaj Dharmani | Parashar Joshi |
| 11 | 2023 | 1359537 | Lucknow Super Giants v Mumbai Indians | LSG v MI | 63rd Match (N), Indian Premier League at Lucknow, May 16 2023 | LSG | MI | MI | BOWL FIRST | 177/3 | 172/5 | LSG | Super Giants won by 5 runs | 2023-05-16T14:00Z | 2023-05-17T23:59Z | 1070094 | Bharat Ratna Shri Atal Bihari Vajpayee Ekana Cricket Stadium, Lucknow | Krunal Pandya | Rohit Sharma | Marcus Stoinis | Lucknow Super Giants 2, Mumbai Indians 0 | False | 20 | 177 | 3 | 17 | 20.0 | 172 | 5 | 19 | Marcus Stoinis brings the muscle in Lucknow’s thrilling victory. Mumbai were in it until the very end but Mohsin Khan bowled a sensational final over | Marcus Stoinis,Krunal Pandya | Ravi Bishnoi,Yash Thakur | Deepak Hooda (AR),Quinton de Kock (WK),Prerak Mankad (AR),Krunal Pandya (AR),Marcus Stoinis (AR),Nicholas Pooran (UKN),Ayush Badoni (UKN),Naveen-ul-Haq (BL),Ravi Bishnoi (BL),Swapnil Singh (BL),Mohsin Khan (BL),Yash Thakur (BL) | Ishan Kishan (WK),Rohit Sharma (UKN),Suryakumar Yadav (UKN),Nehal Wadhera (UKN),Tim David (UKN),Vishnu Vinod (UKN),Cameron Green (AR),Chris Jordan (BL),Hrithik Shokeen (BL),Piyush Chawla (AR),Jason Behrendorff (BL),Akash Madhwal (BL) | Ishan Kishan,Rohit Sharma | Jason Behrendorff,Piyush Chawla | 16 May 2023 - night match (20-over match) | Anil Chaudhary | Nand Kishore | Chris Gaffaney | Sanjay Verma | Mohit Krishnadas |
| 12 | 2023 | 1359536 | Gujarat Titans v Sunrisers Hyderabad | GT v SRH | 62nd Match (N), Indian Premier League at Ahmedabad, May 15 2023 | GT | SRH | SRH | BOWL FIRST | 188/9 | 154/9 | GT | Titans won by 34 runs | 2023-05-15T14:00Z | 2023-05-16T23:59Z | 57851 | Narendra Modi Stadium, Motera, Ahmedabad | Hardik Pandya | Aiden Markram | Shubman Gill | Gujarat Titans 2, Sunrisers Hyderabad 0 | False | 20 | 188 | 9 | 24 | 20.0 | 154 | 9 | 18 | Gill and Shami seal top-two finish for Titans. Mohit Sharma also took four wickets to knock Sunrisers out of the playoffs race | Shubman Gill,Sai Sudharsan | Mohammed Shami,Mohit Sharma | Wriddhiman Saha (WK),Shubman Gill (UKN),Sai Sudharsan (UKN),Hardik Pandya (AR),David Miller (UKN),Rahul Tewatia (AR),Dasun Shanaka (AR),Rashid Khan (AR),Noor Ahmad (BL),Mohammed Shami (BL),Mohit Sharma (BL),Yash Dayal (BL) | Anmolpreet Singh (UKN),Abhishek Sharma (AR),Aiden Markram (UKN),Rahul Tripathi (UKN),Heinrich Klaasen (WK),Sanvir Singh (AR),Abdul Samad (UKN),Marco Jansen (AR),Bhuvneshwar Kumar (BL),Mayank Markande (BL),Fazalhaq Farooqi (BL),T Natarajan (BL) | Heinrich Klaasen,Bhuvneshwar Kumar | Bhuvneshwar Kumar,Fazalhaq Farooqi | 15 May 2023 - night match (20-over match) | Ulhas Gandhe | Jayaraman Madanagopal | Akshay Totre | Manu Nayyar | Mohamed Rafi |
| 13 | 2023 | 1359535 | Chennai Super Kings v Kolkata Knight Riders | CSK v KKR | 61st Match (N), Indian Premier League at Chennai, May 14 2023 | CSK | KKR | CSK | BAT FIRST | 144/6 | 147/4 | KKR | KKR won by 6 wkts (9b rem) | 2023-05-14T14:00Z | 2023-05-15T23:59Z | 58008 | MA Chidambaram Stadium, Chepauk, Chennai | MS Dhoni | Nitish Rana | Rinku Singh | Kolkata Knight Riders 2, Chennai Super Kings 0 | False | 20 | 144 | 6 | 12 | 18.3 | 147 | 4 | 17 | Rinku Singh, Nitish Rana silence Chepauk and keep Kolkata Knight Riders alive. Narine hits form too, meaning a question mark still hangs over CSK’s playoff spot | Shivam Dube,Devon Conway | Deepak Chahar,Tushar Deshpande | Ruturaj Gaikwad (UKN),Devon Conway (UKN),Ajinkya Rahane (UKN),Ambati Rayudu (UKN),Shivam Dube (AR),Moeen Ali (AR),Ravindra Jadeja (AR),MS Dhoni (WK),Deepak Chahar (BL),Tushar Deshpande (BL),Maheesh Theekshana (BL),Matheesha Pathirana (BL) | Jason Roy (UKN),Rahmanullah Gurbaz (WK),Venkatesh Iyer (AR),Nitish Rana (UKN),Rinku Singh (UKN),Andre Russell (AR),Shardul Thakur (BL),Sunil Narine (AR),Vaibhav Arora (BL),Harshit Rana (BL),Varun Chakravarthy (BL),Suyash Sharma (BL) | Nitish Rana,Rinku Singh | Sunil Narine,Varun Chakravarthy | 14 May 2023 - night match (20-over match) | Tapan Sharma | Vinod Seshan | Nitin Menon | Javagal Srinath | VM Dhokre |
The names() function lists all the columns (i.e., potential variables) in the data set.
names(ipl)
## [1] "X" "season" "id"
## [4] "name" "short_name" "description"
## [7] "home_team" "away_team" "toss_won"
## [10] "decision" "X1st_inning_score" "X2nd_inning_score"
## [13] "winner" "result" "start_date"
## [16] "end_date" "venue_id" "venue_name"
## [19] "home_captain" "away_captain" "pom"
## [22] "points" "super_over" "home_overs"
## [25] "home_runs" "home_wickets" "home_boundaries"
## [28] "away_overs" "away_runs" "away_wickets"
## [31] "away_boundaries" "highlights" "home_key_batsman"
## [34] "home_key_bowler" "home_playx1" "away_playx1"
## [37] "away_key_batsman" "away_key_bowler" "match_days"
## [40] "umpire1" "umpire2" "tv_umpire"
## [43] "referee" "reserve_umpire"
The variables I used in my infographic design are:
#season - The IPL season year, which indicates the timeline of each match.
#home_team - The team playing on its home ground for the match.
#away_team - The team playing away from its home ground.
#winner - The team that won the match.
#pom (Player of the Match) - The player who performed exceptionally well and was awarded the Player of the Match.
Using the favstats() function, I calculated summary statistics for the runs scored by the home and away teams.
# Calculate favorite statistics for home runs
home_runs_stats <- favstats(~ home_runs, data = ipl)
print(home_runs_stats)
## min Q1 median Q3 max mean sd n missing
## 58 138.5 160 180 263 158.7464 31.66522 899 0
# Calculate favorite statistics for away runs
away_runs_stats <- favstats(~ away_runs, data = ipl)
print(away_runs_stats)
## min Q1 median Q3 max mean sd n missing
## 44 136 158 176 257 155.8142 31.23854 899 0
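For readers without the mosaic package, the columns reported by favstats() can be approximated in base R. The helper name `fav_summary` is hypothetical, and the sketch assumes the default quantile method; the example values are the `home_runs` entries from the first 10 rows of the data.

```r
# Base-R approximation of the columns that favstats() reports
fav_summary <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)
  data.frame(
    min = min(x, na.rm = TRUE), Q1 = q[1], median = q[2], Q3 = q[3],
    max = max(x, na.rm = TRUE), mean = mean(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE),
    n = sum(!is.na(x)),        # non-missing observations
    missing = sum(is.na(x))    # missing observations
  )
}

# home_runs from the first 10 rows of the dataset
runs <- c(197, 201, 175, 146, 187, 186, 198, 177, 188, 144)
fav_summary(runs)
```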
# Display the structure of the dataset
str(ipl)
## 'data.frame': 899 obs. of 44 variables:
## $ X : int 4 5 6 7 8 9 10 11 12 13 ...
## $ season : num 2023 2023 2023 2023 2023 ...
## $ id : int 1359544 1359543 1359542 1359541 1359540 1359539 1359538 1359537 1359536 1359535 ...
## $ name : chr "Royal Challengers Bangalore v Gujarat Titans" "Mumbai Indians v Sunrisers Hyderabad" "Kolkata Knight Riders v Lucknow Super Giants" "Delhi Capitals v Chennai Super Kings" ...
## $ short_name : chr "RCB v GT" "MI v SRH" "KKR v LSG" "DC v CSK" ...
## $ description : chr "70th Match (N), Indian Premier League at Bengaluru, May 21 2023" "69th Match (D/N), Indian Premier League at Mumbai, May 21 2023" "68th Match (N), Indian Premier League at Kolkata, May 20 2023" "67th Match (D/N), Indian Premier League at Delhi, May 20 2023" ...
## $ home_team : chr "RCB" "MI" "KKR" "DC" ...
## $ away_team : chr "GT" "SRH" "LSG" "CSK" ...
## $ toss_won : chr "GT" "MI" "KKR" "CSK" ...
## $ decision : chr "BOWL FIRST" "BOWL FIRST" "BOWL FIRST" "BAT FIRST" ...
## $ X1st_inning_score: chr "197/5" "200/5" "176/8" "223/3" ...
## $ X2nd_inning_score: chr "198/4" "201/2" "175/7" "146/9" ...
## $ winner : chr "GT" "MI" "LSG" "CSK" ...
## $ result : chr "Titans won by 6 wkts (5b rem)" "Mumbai won by 8 wkts (12b rem)" "Super Giants won by 1 run" "Super Kings won by 77 runs" ...
## $ start_date : chr "2023-05-21T14:00Z" "2023-05-21T10:00Z" "2023-05-20T14:00Z" "2023-05-20T10:00Z" ...
## $ end_date : chr "2023-05-22T23:59Z" "2023-05-22T23:59Z" "2023-05-21T23:59Z" "2023-05-21T23:59Z" ...
## $ venue_id : int 57897 58324 57980 58040 58056 58142 58056 1070094 57851 58008 ...
## $ venue_name : chr "M.Chinnaswamy Stadium, Bengaluru" "Wankhede Stadium, Mumbai" "Eden Gardens, Kolkata" "Arun Jaitley Stadium, Delhi" ...
## $ home_captain : chr "Faf du Plessis" "Rohit Sharma" "Nitish Rana" "David Warner" ...
## $ away_captain : chr "Hardik Pandya" "Aiden Markram" "Krunal Pandya" "MS Dhoni" ...
## $ pom : chr "Shubman Gill" "Cameron Green" "Nicholas Pooran" "Ruturaj Gaikwad" ...
## $ points : chr "Gujarat Titans 2, Royal Challengers Bangalore 0" "Mumbai Indians 2, Sunrisers Hyderabad 0" "Lucknow Super Giants 2, Kolkata Knight Riders 0" "Chennai Super Kings 2, Delhi Capitals 0" ...
## $ super_over : chr "False" "False" "False" "False" ...
## $ home_overs : num 20 18 20 20 20 20 20 20 20 20 ...
## $ home_runs : num 197 201 175 146 187 186 198 177 188 144 ...
## $ home_wickets : num 5 2 7 9 5 5 8 3 9 6 ...
## $ home_boundaries : num 28 31 24 17 26 23 28 17 24 12 ...
## $ away_overs : num 19.1 20 20 20 19.4 19.2 20 20 20 18.3 ...
## $ away_runs : num 198 200 176 223 189 187 213 172 154 147 ...
## $ away_wickets : num 4 5 8 3 6 2 2 5 9 4 ...
## $ away_boundaries : num 25 27 22 31 27 26 31 19 18 17 ...
## $ highlights : chr "Gill's second straight century trumps Kohli's to knock RCB out of playoffs race. Gill's unbeaten 104 featured s"| __truncated__ "Green century and Madhwal four-for help Mumbai Indians finish fourth. Mumbai complete fourth 200+ chase this se"| __truncated__ "Pooran, Bishnoi seal Lucknow Super Giants' playoffs spot with thrilling one-run win. Rinku Singh nearly pulled "| __truncated__ "Gaikwad, Conway script CSK's big win. For Delhi Capitals, the end was as tame as the start of the season" ...
## $ home_key_batsman : chr "Virat Kohli,Faf du Plessis" "Cameron Green,Rohit Sharma" "Rinku Singh,Jason Roy" "David Warner,Axar Patel" ...
## $ home_key_bowler : chr "Mohammed Siraj,Harshal Patel" "Akash Madhwal,Chris Jordan" "Sunil Narine,Shardul Thakur" "Chetan Sakariya,Anrich Nortje" ...
## $ home_playx1 : chr "Virat Kohli (UKN),Faf du Plessis (UKN),Glenn Maxwell (AR),Mahipal Lomror (AR),Michael Bracewell (AR),Dinesh Kar"| __truncated__ "Ishan Kishan (WK),Rohit Sharma (UKN),Cameron Green (AR),Suryakumar Yadav (UKN),Tim David (UKN),Nehal Wadhera (U"| __truncated__ "Jason Roy (UKN),Venkatesh Iyer (AR),Nitish Rana (UKN),Rahmanullah Gurbaz (WK),Rinku Singh (UKN),Andre Russell ("| __truncated__ "Prithvi Shaw (UKN),David Warner (UKN),Phil Salt (WK),Rilee Rossouw (UKN),Yash Dhull (UKN),Axar Patel (AR),Aman "| __truncated__ ...
## $ away_playx1 : chr "Wriddhiman Saha (WK),Shubman Gill (UKN),Vijay Shankar (AR),Dasun Shanaka (AR),David Miller (UKN),Rahul Tewatia "| __truncated__ "Vivrant Sharma (AR),Mayank Agarwal (UKN),Heinrich Klaasen (WK),Glenn Phillips (UKN),Aiden Markram (UKN),Harry B"| __truncated__ "Karan Sharma (AR),Quinton de Kock (WK),Prerak Mankad (AR),Marcus Stoinis (AR),Krunal Pandya (AR),Ayush Badoni ("| __truncated__ "Ruturaj Gaikwad (UKN),Devon Conway (UKN),Shivam Dube (AR),MS Dhoni (WK),Ravindra Jadeja (AR),Ajinkya Rahane (UK"| __truncated__ ...
## $ away_key_batsman : chr "Shubman Gill,Vijay Shankar" "Mayank Agarwal,Vivrant Sharma" "Nicholas Pooran,Quinton de Kock" "Devon Conway,Ruturaj Gaikwad" ...
## $ away_key_bowler : chr "Noor Ahmad,Rashid Khan" "Bhuvneshwar Kumar,Mayank Dagar" "Ravi Bishnoi,Yash Thakur" "Deepak Chahar,Matheesha Pathirana" ...
## $ match_days : chr "21 May 2023 - night match (20-over match)" "21 May 2023 - day/night match (20-over match)" "20 May 2023 - night match (20-over match)" "20 May 2023 - day/night match (20-over match)" ...
## $ umpire1 : chr "Nitin Menon" "KN Ananthapadmanabhan" "Ulhas Gandhe" "Chris Gaffaney" ...
## $ umpire2 : chr "Virender Sharma" "Rod Tucker" "Jayaraman Madanagopal" "Nikhil Patwardhan" ...
## $ tv_umpire : chr "Tapan Sharma" "Rohan Pandit" "Yeshwant Barde" "Anil Chaudhary" ...
## $ referee : chr "Javagal Srinath" "Pankaj Dharmani" "Manu Nayyar" "Sanjay Verma" ...
## $ reserve_umpire : chr "VM Dhokre" "Parashar Joshi" "Mohamed Rafi" "Mohit Krishnadas" ...
# Get a summary for each column to identify potential data issues
summary(ipl)
## X season id name
## Min. : 4.0 Min. :2008 Min. : 335982 Length:899
## 1st Qu.: 244.5 1st Qu.:2012 1st Qu.: 548327 Class :character
## Median : 508.0 Median :2015 Median : 829815 Mode :character
## Mean : 502.5 Mean :2016 Mean : 878206
## 3rd Qu.: 754.5 3rd Qu.:2020 3rd Qu.:1216510
## Max. :1028.0 Max. :2023 Max. :1359544
## short_name description home_team away_team
## Length:899 Length:899 Length:899 Length:899
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## toss_won decision X1st_inning_score X2nd_inning_score
## Length:899 Length:899 Length:899 Length:899
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## winner result start_date end_date
## Length:899 Length:899 Length:899 Length:899
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## venue_id venue_name home_captain away_captain
## Min. : 57851 Length:899 Length:899 Length:899
## 1st Qu.: 57991 Class :character Class :character Class :character
## Median : 58150 Mode :character Mode :character Mode :character
## Mean : 126636
## 3rd Qu.: 59094
## Max. :1070094
## pom points super_over home_overs
## Length:899 Length:899 Length:899 Min. : 4.50
## Class :character Class :character Class :character 1st Qu.:19.20
## Mode :character Mode :character Mode :character Median :20.00
## Mean :19.12
## 3rd Qu.:20.00
## Max. :20.00
## home_runs home_wickets home_boundaries away_overs
## Min. : 58.0 Min. : 0.000 Min. : 3.00 Min. : 4.20
## 1st Qu.:138.5 1st Qu.: 4.000 1st Qu.:15.00 1st Qu.:19.20
## Median :160.0 Median : 6.000 Median :19.00 Median :20.00
## Mean :158.7 Mean : 5.752 Mean :19.48 Mean :19.09
## 3rd Qu.:180.0 3rd Qu.: 8.000 3rd Qu.:24.00 3rd Qu.:20.00
## Max. :263.0 Max. :10.000 Max. :42.00 Max. :20.00
## away_runs away_wickets away_boundaries highlights
## Min. : 44.0 Min. : 0.000 Min. : 4.00 Length:899
## 1st Qu.:136.0 1st Qu.: 4.000 1st Qu.:15.00 Class :character
## Median :158.0 Median : 6.000 Median :19.00 Mode :character
## Mean :155.8 Mean : 5.905 Mean :19.07
## 3rd Qu.:176.0 3rd Qu.: 8.000 3rd Qu.:23.00
## Max. :257.0 Max. :10.000 Max. :41.00
## home_key_batsman home_key_bowler home_playx1 away_playx1
## Length:899 Length:899 Length:899 Length:899
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## away_key_batsman away_key_bowler match_days umpire1
## Length:899 Length:899 Length:899 Length:899
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
## umpire2 tv_umpire referee reserve_umpire
## Length:899 Length:899 Length:899 Length:899
## Class :character Class :character Class :character Class :character
## Mode :character Mode :character Mode :character Mode :character
##
##
##
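The str() and summary() output shows several columns stored as text that carry numeric, date, or logical information: the innings scores ("197/5"), `start_date` ("2023-05-21T14:00Z"), and `super_over` ("False"). A minimal sketch of converting them, assuming every score has the form runs/wickets and that the positive `super_over` value is spelled "True" (only "False" appears in the rows shown here); the new column names are hypothetical:

```r
# Small stand-in with the same column formats that str(ipl) shows
toy <- data.frame(
  X1st_inning_score = c("197/5", "200/5", "176/8"),
  start_date        = c("2023-05-21T14:00Z", "2023-05-21T10:00Z", "2023-05-20T14:00Z"),
  super_over        = c("False", "False", "True"),  # "True" is an assumed spelling
  stringsAsFactors  = FALSE
)

# Split "runs/wickets" into two integer columns
parts <- strsplit(toy$X1st_inning_score, "/", fixed = TRUE)
toy$first_inning_runs    <- as.integer(vapply(parts, `[`, character(1), 1))
toy$first_inning_wickets <- as.integer(vapply(parts, `[`, character(1), 2))

# Parse the timestamp as UTC (the trailing Z is matched literally)
toy$start_time <- as.POSIXct(toy$start_date, format = "%Y-%m-%dT%H:%MZ", tz = "UTC")

# Turn the "True"/"False" strings into a logical column
toy$super_over <- toy$super_over == "True"

str(toy)
```

A score without a slash would produce NA in the wickets column, which makes such rows easy to find and inspect.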
# Count matches and wins for each team when it is listed as the home team
match_outcomes <- ipl %>%
  mutate(home_win = ifelse(home_team == winner, 1, 0)) %>%
  group_by(home_team) %>%
  summarise(total_matches = n(),
            home_wins = sum(home_win, na.rm = TRUE)) %>%
  ungroup() # summarise() already drops the single grouping level; kept as a safeguard
# Plot home wins by team using ggplot2
g <- ggplot(match_outcomes, aes(x = home_team, y = home_wins, fill = home_team)) +
  geom_col() + # Plots the values as given, like geom_bar(stat = "identity")
  labs(title = "Home Wins by Team", x = "Team", y = "Wins") +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
        legend.position = "none") # Hide the legend because the x-axis already names each team
# Convert ggplot object to plotly for interactivity
interactive_plot <- ggplotly(g)
interactive_plot
Key takeaways
The bar chart shows how many matches each IPL team has won as the listed home team, summed over all seasons. Franchises like Mumbai Indians (MI) and Chennai Super Kings (CSK) have a considerable lead in victories. Kochi and Pune Warriors India (PWI) have significantly fewer wins, which is partly explained by their no longer participating in the league and therefore having played fewer matches. Because the bars are raw counts aggregated across seasons, the chart cannot show which individual seasons were exceptionally good or poor, and the totals favor teams that have played more home matches. With that caveat, the chart contrasts the longstanding success of certain teams with the brief presence of others.
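Because teams that left the league early have had fewer chances to win, a win rate is a fairer comparison than a raw count. A minimal sketch on made-up rows (the team codes follow the dataset, the results do not; `home_win_rate` is a hypothetical column name):

```r
library(dplyr)

# Toy matches; column names follow the dataset, results are invented
toy <- data.frame(
  home_team = c("MI", "MI", "MI", "MI", "CSK", "CSK", "CSK", "PWI", "PWI"),
  winner    = c("MI", "MI", "CSK", "MI", "CSK", "RCB", "CSK", "PWI", "MI")
)

home_win_rates <- toy %>%
  mutate(home_win = home_team == winner) %>%
  group_by(home_team) %>%
  summarise(
    total_matches = n(),
    home_wins     = sum(home_win),
    home_win_rate = home_wins / total_matches,  # wins per home match played
    .groups = "drop"
  ) %>%
  arrange(desc(home_win_rate))

home_win_rates
```

Plotting `home_win_rate` instead of `home_wins` would put short-lived and long-standing franchises on the same scale.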
# For each toss winner and decision, count how often each team won the match
toss_decisions <- ipl %>%
  group_by(toss_won, decision, winner) %>%
  summarise(wins = n(), .groups = 'drop') %>%
  # Compute the share within each toss-winner/decision combination,
  # rather than as a share of all matches in the dataset
  group_by(toss_won, decision) %>%
  mutate(win_rate = wins / sum(wins)) %>%
  ungroup()
g <- ggplot(toss_decisions, aes(x = toss_won, y = win_rate, fill = decision)) +
  geom_bar(stat = "identity", position = "dodge") +
  facet_wrap(~winner) +
  labs(title = "Toss Decision Impact on Match Outcome", x = "Team that Won the Toss", y = "Win Rate") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
interactive_plot <- ggplotly(g)
interactive_plot
Key Takeaways
The visualization shows how winning the toss, and the subsequent decision to bat or bowl first, relates to winning matches in the IPL. Each facet produced by facet_wrap(~winner) corresponds to one match-winning team; within a facet, the x-axis gives the team that won the toss and the bar color gives its decision. Comparing a team's own facet with the others indicates whether it wins more often after winning the toss, and whether it is more successful when batting first or chasing a target. For some teams the toss decision may matter more than for others, although the chart alone does not establish that the toss causes the result.
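The question of a toss advantage can also be put as a single number: how often does the team that won the toss go on to win the match? A minimal sketch on invented rows, assuming every match has a winner; the column `toss_winner_won` is hypothetical and the exact binomial test is one possible check, not the method used in this report:

```r
# Toy data with invented outcomes; column names follow the dataset
toy <- data.frame(
  toss_won = c("MI", "CSK", "RCB", "MI", "KKR", "CSK", "RCB", "MI", "KKR", "CSK"),
  decision = c("BOWL FIRST", "BAT FIRST", "BOWL FIRST", "BOWL FIRST", "BAT FIRST",
               "BOWL FIRST", "BAT FIRST", "BOWL FIRST", "BOWL FIRST", "BAT FIRST"),
  winner   = c("MI", "RCB", "RCB", "CSK", "KKR", "CSK", "MI", "MI", "RCB", "CSK")
)

# TRUE when the team that won the toss also won the match
toy$toss_winner_won <- toy$toss_won == toy$winner

# Win rate of the toss winner, overall and by decision
overall     <- mean(toy$toss_winner_won)
by_decision <- tapply(toy$toss_winner_won, toy$decision, mean)
overall
by_decision

# Exact binomial test against 50%: is the toss winner favored?
test <- binom.test(sum(toy$toss_winner_won), nrow(toy), p = 0.5)
test$p.value
```

With only ten invented rows the test has no power; on the full dataset the same three steps would apply to `ipl` directly.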
# Analyzing key player performance
key_players_performance <- ipl %>%
  select(home_key_batsman, home_key_bowler, away_key_batsman, away_key_bowler, winner) %>%
  pivot_longer(cols = c(home_key_batsman, home_key_bowler, away_key_batsman, away_key_bowler),
               names_to = "type", values_to = "player") %>%
  filter(player != "") # Assuming a blank string indicates no key player listed
# Count how often each entry appears; each entry is a comma-separated pair of players
player_counts <- key_players_performance %>%
  group_by(player) %>%
  summarise(counts = n(), .groups = 'drop')
# Top 10 entries (slice_max() supersedes top_n() and also keeps ties)
top_players <- player_counts %>%
  slice_max(counts, n = 10)
# Plotting top players
g <- ggplot(top_players, aes(x = reorder(player, counts), y = counts, fill = player)) +
  geom_bar(stat = "identity") +
  coord_flip() + # Horizontal bars for better readability
  labs(title = "Top Key Players by Appearances", x = "Player", y = "Counts") +
  theme(legend.position = "none") # Remove the legend to clean up the plot
# Convert ggplot object to plotly for interactivity
interactive_plot <- ggplotly(g)
# Render the interactive plot
interactive_plot
Key Takeaways
The horizontal bar chart highlights the top key player pairings by the number of matches in which they are listed. The pairing of Chris Gayle and Virat Kohli stands out with the highest frequency, underscoring their prominence. Other notable pairings, such as AB de Villiers with Virat Kohli and the duo of Suresh Raina with MS Dhoni, also feature prominently. The chart counts appearances only; it does not use the match result, so by itself it cannot show that these pairings led their teams to victory. It does point to prolific partnerships whose link to match outcomes would be worth testing.
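Since each key-player entry is a comma-separated pair, the same two players listed in a different order are counted as different entries. A minimal sketch of counting individual players instead, using tidyr's separate_rows(); the rows below are illustrative strings in the dataset's format, not actual records:

```r
library(dplyr)
library(tidyr)

# Illustrative entries in the same "Player A,Player B" format as the dataset
toy <- data.frame(
  home_key_batsman = c("Virat Kohli,Faf du Plessis",
                       "Faf du Plessis,Virat Kohli",
                       "Cameron Green,Rohit Sharma"),
  away_key_batsman = c("Shubman Gill,Vijay Shankar",
                       "Virat Kohli,Faf du Plessis",
                       "Shubman Gill,Vijay Shankar")
)

player_counts <- toy %>%
  pivot_longer(everything(), names_to = "type", values_to = "player") %>%
  filter(player != "") %>%
  separate_rows(player, sep = ",") %>%   # one row per individual player
  mutate(player = trimws(player)) %>%    # guard against stray spaces
  count(player, name = "counts", sort = TRUE)

player_counts
```

The resulting `player_counts` has the same columns as in the analysis above, so the plotting code would work unchanged.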
# Plotting interactive bar chart for toss impact
toss_effect <- ipl %>%
group_by(toss_won, decision, winner) %>%
summarise(count = n(), .groups = 'drop')
plot_toss_effect <- plot_ly(toss_effect, x = ~toss_won, y = ~count, color = ~decision, type = 'bar', text = ~paste("Winner:", winner)) %>%
layout(title = "Impact of Toss Decision on Match Outcomes",
xaxis = list(title = "Team that Won the Toss"),
yaxis = list(title = "Number of Matches Won"),
barmode = 'stack')
# To render in R Markdown
plot_toss_effect
Key Takeaways
The bar chart provides a visual analysis of how the toss decision impacts match outcomes for various IPL teams. Teams are sorted along the x-axis based on their decisions after winning the toss, split between ‘Bowl First’ and ‘Bat First’. The length of the bars corresponds to the number of matches won following each decision. Teams like Mumbai Indians (MI) show a higher number of wins when choosing to bowl first, which is a common trend among other teams as well, suggesting a preference or strategic advantage in chasing scores. Notably, the chart also shows that for some teams, such as Chennai Super Kings (CSK) and Sunrisers Hyderabad (SRH), victories are substantial regardless of the toss decision, pointing to their overall strong performance.
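To put a number on the toss effect described above, one option is to compute how often the toss winner also wins the match, split by decision. This is a rough sketch on invented toy data; the column names mirror the ones used in this report, but the rows and the decision labels are assumptions for illustration only.

```r
library(dplyr)

# Toy data (invented): one row per match
toy_matches <- tibble(
  toss_won = c("MI", "MI", "CSK", "CSK", "RCB", "RCB"),
  decision = c("Bowl First", "Bat First", "Bowl First",
               "Bowl First", "Bat First", "Bowl First"),
  winner   = c("MI", "CSK", "CSK", "MI", "RCB", "RCB")
)

# Share of matches in which the toss winner also won, per decision
toss_win_rate <- toy_matches %>%
  mutate(toss_winner_won = toss_won == winner) %>%
  group_by(decision) %>%
  summarise(
    matches  = n(),
    wins     = sum(toss_winner_won),
    win_rate = wins / matches,
    .groups  = "drop"
  )

toss_win_rate
```

A win rate near 0.5 for both decisions would suggest the toss matters little, whereas a clear gap would support the idea of a chasing advantage.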
Let us focus on Team RCB (Royal Challengers Bengaluru)
# Total matches by season
matches_per_season <- ipl %>%
group_by(season) %>%
summarise(Total_Matches = n())
# Plot
g <- ggplot(matches_per_season, aes(x = season, y = Total_Matches)) +
geom_line(group = 1, color = "blue") +
geom_point(color = "red", size = 3) +
labs(title = "Total Matches Played Each Season", x = "Season", y = "Matches")
ggplotly(g) # Converts ggplot2 to interactive plotly object
Key Takeaways
The line graph depicts the total number of IPL matches played in each season over the years. There’s a noticeable peak around the early 2010s, indicating a season or seasons with a particularly high number of matches. This peak is followed by a sharp decline and then a period of fluctuation, which could be attributed to changes in league format, the number of participating teams, or external factors such as events that might have led to a reduced number of games. Post-2016, there is a recovery and a return to higher match counts, suggesting a possible expansion or return to the previous format. The graph captures the dynamic nature of the league’s structure across different seasons.
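# Calculate win percentages: each team's wins divided by the matches it played
matches_played <- ipl %>%
select(home_team, away_team) %>%
pivot_longer(everything(), values_to = "team") %>%
count(team, name = "Total")
win_percentages <- ipl %>%
filter(!is.na(winner), winner != "") %>% # skip matches with no result
count(winner, name = "Wins") %>%
inner_join(matches_played, by = c("winner" = "team")) %>%
mutate(Win_Percentage = Wins / Total * 100)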
# Plot
g <- ggplot(win_percentages, aes(x = reorder(winner, Win_Percentage), y = Win_Percentage, fill = winner)) +
geom_col() +
coord_flip() +
labs(title = "Winning Percentage by Team", x = "Team", y = "Win Percentage")
ggplotly(g)
Key Takeaways
Teams like Mumbai Indians (MI), Chennai Super Kings (CSK), and Kolkata Knight Riders (KKR) appear to have the highest win percentages, indicating consistent success in the league. In contrast, teams like Kochi and Pune Warriors India (PWI) have lower winning percentages, which may partly reflect their short stints in the IPL. The bar colors come from mapping fill to the team name, so they serve only to tell teams apart and do not correspond to official team colors. The chart summarizes the historical performance of each team, with clear leaders and teams that struggled during their time in the IPL.
# Convert necessary columns to appropriate data types, if not already
ipl$season <- as.factor(ipl$season)
ipl$date <- as.Date(ipl$start_date, format="%d-%m-%Y")
# Aggregate data for visualization
toss_summary <- ipl %>%
group_by(decision) %>%
summarise(Count = n())
# Create an interactive pie chart
plot_ly(toss_summary, labels = ~decision, values = ~Count, type = 'pie', textinfo = 'label+percent',
insidetextorientation = 'radial') %>%
layout(title = 'Distribution of Toss Decisions (Bat vs Bowl)')
Key Takeaways
The pie chart illustrates the distribution of toss decisions between batting first and bowling first in the IPL. It’s clear that choosing to bowl first is the more prevalent decision, accounting for 64.1% of the choices, while batting first is less common at 35.9%. This could indicate a strategic preference for chasing targets, or it might reflect the conditions of the pitches and the advantages of knowing the target score. The significant skew towards bowling first may also suggest teams have more confidence in their ability to chase down runs, or it could be influenced by other factors such as evening dew affecting play, which often makes it preferable to bowl first.
# Prepare data for stacked area chart
team_performance <- ipl %>%
group_by(season, winner) %>%
summarise(Wins = n(), .groups = 'drop')
# Create interactive stacked area chart
plot_ly(team_performance, x = ~season, y = ~Wins, type = 'scatter', mode = 'lines', stackgroup = 'one', color = ~winner) %>%
layout(title = 'Team Performances Across IPL Seasons', xaxis = list(title = 'Season'), yaxis = list(title = 'Wins'))
Key Takeaways
The stacked area chart portrays the win counts of IPL teams across seasons, with different colors representing each team. A pattern of fluctuations in performance is noticeable, where some teams like Mumbai Indians (MI) and Chennai Super Kings (CSK) maintain a higher number of wins consistently, suggesting steady performance. In contrast, teams like Pune Warriors India (PWI) and Kochi show a limited presence, indicative of their brief participation in the league. The graph also illustrates the competitive nature of the league, with the changing fortunes of teams like Sunrisers Hyderabad (SRH) and Royal Challengers Bangalore (RCB) throughout the seasons. This visualization effectively captures the dynamic ebb and flow of team success in the IPL over time.
# RCB yearly performance
rcb_performance <- ipl %>%
filter(home_team == "RCB" | away_team == "RCB") %>%
group_by(season) %>%
summarise(Wins = sum(winner == "RCB", na.rm = TRUE), Matches = n()) %>%
mutate(Win_Percentage = (Wins / Matches) * 100)
# Line chart of wins over the years
plot_ly(rcb_performance, x = ~season, y = ~Win_Percentage, type = 'scatter', mode = 'lines+markers') %>%
layout(title = 'RCB Performance Over the Seasons', xaxis = list(title = 'Season'), yaxis = list(title = 'Win Percentage (%)'))
Key Takeaways
The line graph traces the win percentage of Royal Challengers Bangalore (RCB) across IPL seasons. The graph highlights RCB’s fluctuating performance, with notable peaks in the 2009, 2011, and 2016 seasons—years in which the team reached the finals but did not secure the championship. These peaks might reflect the team’s strong performance during those seasons, indicating a potential correlation between a higher win percentage during the regular season and the ability to reach the final stages of the tournament. The dips in win percentage in other years point to less successful campaigns. This visualization encapsulates RCB’s journey of close contests and their quest for an elusive title victory.
# Prepare data
match_metrics <- ipl %>%
group_by(id) %>%
summarise(TotalRuns = sum(as.numeric(home_runs), as.numeric(away_runs)), TotalWickets = sum(as.numeric(home_wickets), as.numeric(away_wickets)), .groups = 'drop')
# Create a bubble chart (trace type set explicitly to avoid plotly guessing it)
plot_ly(match_metrics, x = ~TotalRuns, y = ~TotalWickets, size = ~TotalRuns, text = ~id, type = 'scatter', mode = 'markers',
marker = list(sizemode = 'area', sizeref = 0.1)) %>%
layout(title = 'Correlation Between Runs and Wickets in Matches', xaxis = list(title = 'Total Runs'), yaxis = list(title = 'Total Wickets'))
Key Takeaways
This graph is a good example of why correlation doesn’t usually mean causation:
The scatter plot displays a dense cluster of data points around the middle of the graph, indicating that most matches see a moderate number of runs scored and wickets taken. The spread of the data, particularly along the x-axis for total runs, shows variability in match scoring. The outliers with high run counts and low wicket numbers might represent matches where batters dominated, while points with lower run counts and higher wicket numbers could depict bowler-friendly matches. Overall, the plot does not show a clear trend between the two variables: high-scoring matches don’t necessarily result in fewer wickets, and vice versa.
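The visual impression of "no clear trend" can be checked numerically with a correlation coefficient. Below is a small sketch using base R on invented numbers; the data frame is hypothetical and only the column names follow the match_metrics table built above.

```r
# Toy data (invented): total runs and total wickets for six matches
match_metrics_toy <- data.frame(
  TotalRuns    = c(310, 285, 340, 260, 395, 300),
  TotalWickets = c(12, 15, 9, 14, 11, 13)
)

# Pearson correlation coefficient between runs and wickets
r <- cor(match_metrics_toy$TotalRuns, match_metrics_toy$TotalWickets)

# Significance test for the correlation (the p-value is unreliable with so few rows)
ct <- cor.test(match_metrics_toy$TotalRuns, match_metrics_toy$TotalWickets)

r
ct$p.value
```

On the real data, a coefficient close to zero would back up the reading that runs and wickets are only weakly related.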
# Toss impact on RCB matches
rcb_toss_impact <- ipl %>%
filter(home_team == "RCB" | away_team == "RCB") %>%
group_by(toss_won, decision, winner) %>%
summarise(Count = n(), .groups = 'drop') %>%
filter(toss_won == "RCB")
# Bar chart: Toss decisions vs match outcomes for RCB
plot_ly(rcb_toss_impact, x = ~decision, y = ~Count, color = ~winner, type = 'bar') %>%
layout(title = 'Impact of Toss Decisions on RCB Match Outcomes', xaxis = list(title = 'Toss Decision'), yaxis = list(title = 'Number of Matches'))
Key Takeaways
The bar chart illustrates the impact of toss decisions on Royal Challengers Bangalore (RCB) match outcomes against various teams. It compares the number of matches RCB won when deciding to bat first versus bowl first. The chart indicates that RCB has a stark contrast in outcomes based on this decision, with significantly more wins when choosing to bowl first, especially against one team, represented by the prominent yellow bar. This suggests that RCB may have a stronger performance in chasing targets or that the conditions during these matches were more favorable for bowling first. The data also shows that wins after batting first are relatively fewer, which could reflect either a less successful strategy or a sample of fewer matches where this decision was made. Overall, the visualization underscores the importance of the toss and subsequent strategic decisions in the context of RCB’s match outcomes.
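Because the two toss decisions may rest on very different numbers of matches, it can help to compare win rates and run a simple test of association. The sketch below uses Fisher's exact test from base R on a hypothetical 2x2 table; the counts are made up for illustration and are not taken from the IPL data.

```r
# Hypothetical counts (not from the dataset): RCB results after winning the toss
toss_table <- matrix(
  c(30, 25,   # Bowl First: won, lost
    10, 15),  # Bat First:  won, lost
  nrow = 2, byrow = TRUE,
  dimnames = list(decision = c("Bowl First", "Bat First"),
                  result   = c("Won", "Lost"))
)

# Win rate for each decision
win_rates <- toss_table[, "Won"] / rowSums(toss_table)

# Fisher's exact test: is the result associated with the toss decision?
ft <- fisher.test(toss_table)

win_rates
ft$p.value
```

A large p-value here would mean the apparent gap between decisions could easily be due to chance, which matters when one decision was made far less often.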
# Player of the Match awards in RCB wins
rcb_top_players <- ipl %>%
filter(winner == "RCB") %>%
group_by(pom) %>%
summarise(Awards = n(), .groups = 'drop') %>%
arrange(desc(Awards))
# Bar chart: Top RCB players by Player of the Match awards
plot_ly(rcb_top_players, x = ~pom, y = ~Awards, type = 'bar') %>%
layout(title = 'Top RCB Players by Player of the Match Awards', xaxis = list(title = 'Player'), yaxis = list(title = 'Number of Awards'))
Key Takeaways
This graph suggests that RCB have relied on individual brilliance more than on the collective performance of all eleven players: Player of the Match awards in their wins are concentrated among Virat Kohli, Chris Gayle, and AB de Villiers (ABD).
# RCB win rate per season
rcb_performance <- ipl %>%
filter(home_team == "RCB" | away_team == "RCB") %>%
group_by(season) %>%
summarise(Wins = sum(winner == "RCB", na.rm = TRUE), Matches = n()) %>% # na.rm guards against missing winners
mutate(Win_Rate = Wins / Matches * 100)
# Plot (season is a factor, so group = 1 is needed to connect the points)
g <- ggplot(rcb_performance, aes(x = season, y = Win_Rate)) +
geom_line(group = 1, color = "darkred") +
geom_point(color = "black") +
labs(title = "RCB Win Rate Over the Seasons", x = "Season", y = "Win Rate (%)")
ggplotly(g)
Key Takeaways
The line chart presents the win rate percentage of Royal Challengers Bangalore (RCB) across IPL seasons. There is noticeable variability in RCB’s performance over the years: the win rate peaks in certain seasons, suggesting periods of strong performance, and drops in others, indicating less successful campaigns. The plot doesn’t show a consistent upward or downward trend, which implies that RCB’s success in the league fluctuates rather than following a clear trajectory of improvement or decline.
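# Most frequent Player of the Match in RCB wins for each final-reaching season
star_performers <- ipl %>%
filter(season %in% c(2009, 2011, 2016), winner == "RCB") %>%
group_by(season, pom) %>%
summarise(Times_POM = n(), .groups = "drop_last") %>% # stay grouped by season
filter(Times_POM == max(Times_POM))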
# Plot
g <- ggplot(star_performers, aes(x = pom, y = Times_POM, fill = as.factor(season))) +
geom_col(position = "dodge") +
labs(title = "Star Performers in 2009, 2011, 2016", x = "Player of the Match", y = "Count")
ggplotly(g)
Key Takeaways
In 2009, 2011, and 2016, RCB had three key MVPs, Jacques Kallis, Chris Gayle, and Virat Kohli, who played a very important role in taking the team to the final. Each time, however, RCB couldn’t cross the finish line, and within the 2008 to 2023 period covered here they were still waiting for their first IPL title.
# Calculate run rate in RCB matches
# Note: this sums the runs and overs of both teams in every match involving RCB
run_rate_season <- ipl %>%
filter(home_team == "RCB" | away_team == "RCB") %>%
group_by(season) %>%
summarise(Total_Runs = sum(home_runs) + sum(away_runs),
Total_Overs = sum(home_overs) + sum(away_overs),
Average_Run_Rate = Total_Runs / Total_Overs)
# Plot (season is a factor, so group = 1 is needed to connect the points)
g <- ggplot(run_rate_season, aes(x = season, y = Average_Run_Rate)) +
geom_line(group = 1, color = "green") +
geom_point(color = "red") +
labs(title = "Average Run Rate in RCB Matches by Season", x = "Season", y = "Run Rate")
ggplotly(g)
Key Takeaways
This graph shows how the game has evolved over the years, with batsmen having a greater say in how matches unfold. In 2016, RCB had a brilliant season in which their batsmen played a key role in putting more runs on the board, which went a long way toward winning games.
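To isolate RCB's own batting, the run rate can be restricted to the innings RCB batted. The sketch below runs on invented toy data and makes two assumptions that should be checked against the real dataset: that the runs and overs columns are numeric, and that overs use cricket notation in which 19.4 means 19 overs and 4 balls. The helper overs_to_decimal is hypothetical and not part of any package.

```r
library(dplyr)

# Hypothetical helper: convert cricket overs notation to decimal overs
# (19.4 is read as 19 overs + 4 balls, i.e. 19 + 4/6)
overs_to_decimal <- function(overs) {
  whole <- floor(overs)
  balls <- round((overs - whole) * 10)
  whole + balls / 6
}

# Toy data (invented): three matches involving RCB
toy_matches <- tibble(
  season     = c(2016, 2016, 2017),
  home_team  = c("RCB", "MI", "RCB"),
  away_team  = c("CSK", "RCB", "KKR"),
  home_runs  = c(180, 150, 140),
  away_runs  = c(160, 200, 141),
  home_overs = c(20, 20, 20),
  away_overs = c(20, 20, 19.4)
)

# Keep only RCB's own runs and overs, then compute the run rate per season
rcb_run_rate <- toy_matches %>%
  filter(home_team == "RCB" | away_team == "RCB") %>%
  mutate(
    rcb_runs  = ifelse(home_team == "RCB", home_runs, away_runs),
    rcb_overs = overs_to_decimal(ifelse(home_team == "RCB", home_overs, away_overs))
  ) %>%
  group_by(season) %>%
  summarise(run_rate = sum(rcb_runs) / sum(rcb_overs), .groups = "drop")

rcb_run_rate
```

Separating the two innings this way would make it possible to say whether a high-scoring season came from RCB's batsmen or from their opponents.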
# Boundary comparison
# Note: this sums boundaries hit by both teams in every match involving RCB
boundary_stats <- ipl %>%
filter(home_team == "RCB" | away_team == "RCB") %>%
group_by(season) %>%
summarise(Total_Boundaries = sum(home_boundaries) + sum(away_boundaries))
# Plot
g <- ggplot(boundary_stats, aes(x = season, y = Total_Boundaries)) +
geom_bar(stat = "identity", fill = "purple") +
labs(title = "Total Boundaries in RCB Matches per Season", x = "Season", y = "Boundaries")
ggplotly(g)
A Few Extra Analyses of the IPL Over the Years
# Filter data for RCB matches
rcb_matches <- ipl %>%
filter(home_team == "RCB" | away_team == "RCB")
# Calculate wins and losses
rcb_performance <- rcb_matches %>%
mutate(rcb_won = ifelse(winner == "RCB", 1, ifelse(winner == "", 0, -1))) %>%
group_by(season) %>%
summarise(total_wins = sum(rcb_won == 1, na.rm = TRUE),
total_losses = sum(rcb_won == -1, na.rm = TRUE),
no_results = sum(rcb_won == 0, na.rm = TRUE), .groups = 'drop') %>%
arrange(desc(season))
# Plotting RCB's performance over seasons
ggplot(rcb_performance, aes(x = season)) +
geom_bar(aes(y = total_wins), stat = "identity", fill = "green") +
geom_bar(aes(y = -total_losses), stat = "identity", fill = "red") +
labs(title = "RCB Season Performance (Wins/Losses)", x = "Season", y = "Number of Matches")
Key Takeaways
The bar chart visualizes Royal Challengers Bangalore (RCB)’s win-loss record over several IPL seasons. Green bars represent the number of matches won in a season, while red bars indicate the number of losses. The lengths of the bars suggest the relative success or difficulty RCB had each year. It seems that in some years RCB experienced more wins than losses, as indicated by taller green bars, while in other seasons, the red bars are more prominent, denoting a challenging season with more losses. The chart serves as a quick visual summary of the team’s performance highs and lows throughout the observed time frame.
# Venue performance
venue_performance <- ipl %>%
filter(home_team == "RCB") %>%
group_by(venue_name) %>%
summarise(wins = sum(winner == "RCB", na.rm = TRUE), matches = n(), .groups = 'drop') %>%
mutate(win_rate = wins / matches)
# Plotting venue performance (assigned to g so ggplotly converts this plot, not an earlier one)
g <- ggplot(venue_performance, aes(x = reorder(venue_name, win_rate), y = win_rate)) +
geom_bar(stat = "identity", fill = "lightblue") +
coord_flip() + # Flip coordinates for horizontal bars
labs(title = "RCB Performance by Venue", x = "Venue", y = "Win Rate")
# Convert ggplot object to plotly for interactivity
interactive_plot <- ggplotly(g)
# Render the interactive plot
interactive_plot
Introduction
In this project, I delve into the vast IPL dataset to uncover trends and insights into the performances of teams and players. My goal is to evaluate how factors such as toss decisions and individual contributions affect match outcomes. The dataset covers matches from 2008 to 2023, offering a comprehensive look at the IPL’s history through various statistical lenses and visualizations.
Project Vision
Initially, I envisioned a straightforward analysis focusing on general performance metrics. However, as the project unfolded, my vision evolved to a more nuanced inquiry—especially the strategic nuances behind toss decisions and the pivotal roles played by key figures such as Virat Kohli. This shift was prompted by intriguing patterns found in the early stages of data exploration.
Explanation of Project Expansion
What started as a module project limited to basic visualization techniques has now expanded into a broader analysis. I incorporated interactive visualizations to enable a dynamic exploration of the data. I also undertook a detailed study of team performances across seasons and scrutinized the statistical impact of players’ performances on their team’s success, thereby enhancing the depth of the original project.
Reflection and Conclusion
Upon reflection, I consider the successful visualization of complex IPL data and the narrative constructed around strategic decisions as significant accomplishments. I faced challenges in data cleaning and ensuring the interpretive accuracy of advanced statistical patterns. This project has underscored the rich stories that data can tell in sports, beyond mere numbers. It has also set the stage for future projects, where I aim to delve into predictive analytics and provide foresight into IPL match outcomes.
My Take on the analysis made
Reflecting on my experience with the INSH 5302 course, creating interactive R Markdown visualizations of the IPL dataset was incredibly enriching and enjoyable. This course vividly highlighted the importance of data storytelling, demonstrating how effectively visualized data can communicate complex insights in an intuitive and impactful way. Through this project, I deepened my understanding of R and learned to harness the power of plotly and ggplot2 to make data more accessible and engaging. The ability to explore IPL performances interactively not only solidified my data manipulation skills but also sparked my creativity in presenting data stories. It’s clear that mastering these tools enhances the narrative quality of data analysis, making it a crucial skill set for any data scientist.